Dragon Systems’ 1998 Broadcast News Transcription System for Mandarin

Authors

  • Puming Zhan
  • Steven Wegmann
  • Larry Gillick
Abstract

In this paper we shall describe Dragon Systems’ 1998 Broadcast News transcription system for Mandarin. We shall describe our music classifier, which was unique to our Mandarin system, as well as our speaker change detection algorithm, which was used in our English and Mandarin systems. We shall also report on preliminary, post-evaluation experiments with pitch.

Similar resources

Dragon systems' 1998 broadcast news transcription system

In this paper we shall describe key improvements to Dragon’s Broadcast News Transcription System, which include: the addition of a speaker-change detection algorithm to our preprocessing subsystem, a new diagonalizing transformation trained using semi-tied covariances, and the addition of probabilities on pronunciations. This new transcription system yields a word error rate of 15.2% on the 199...

Segmentation of Automatically Transcribed Broadcast News Text

Expertise in the automatic transcription of broadcast speech has progressed to the point of being able to use the resulting transcripts for information retrieval purposes. In this paper, we describe the Segmentation system used by Dragon Systems in the Segmentation task of the 1998 TDT evaluation, highlighting improvements made since the September 1998 dryrun. Segmentation of closed-caption and...

Text segmentation and topic tracking on broadcast news via a hidden Markov model approach

Continuing progress in the automatic transcription of broadcast speech via speech recognition has raised the possibility of applying information retrieval techniques to the resulting (errorful) text. In this paper we describe a general methodology based on Hidden Markov Models and classical language modeling techniques for automatically inferring story boundaries (segmentation) and for retrievi...

Dragon Systems’ Automatic Transcription of New TDT Corpus

Dragon Systems has agreed to provide automatically generated transcripts for around 1000 hours of Broadcast News, annotated with word-level time-markings and confidence estimates. Pilot transcripts of about 30 hours of data will be available sometime in February, with the project completed by mid-July. In this paper, we describe how we took our 1997 Hub4 evaluation system which ran at around 14...

Publication date: 1999